Semantics Altering Modifications for Evaluating Comprehension in Machine Reading
Advances in NLP have yielded impressive results for the task of machine
reading comprehension (MRC), with approaches having been reported to achieve
performance comparable to that of humans. In this paper, we investigate whether
state-of-the-art MRC models are able to correctly process Semantics Altering
Modifications (SAM): linguistically-motivated phenomena that alter the
semantics of a sentence while preserving most of its lexical surface form. We
present a method to automatically generate and align challenge sets featuring
original and altered examples. We further propose a novel evaluation
methodology to correctly assess the capability of MRC systems to process these
examples independent of the data they were optimised on, by discounting for
effects introduced by domain shift. In a large-scale empirical study, we apply
the methodology in order to evaluate extractive MRC models with regard to their
capability to correctly process SAM-enriched data. We comprehensively cover 12
different state-of-the-art neural architecture configurations and four training
datasets and find that -- despite their well-known remarkable performance --
optimised models consistently struggle to correctly process semantically
altered data.
Comment: AAAI 2021, final version. 7 pages content + 2 pages reference
A Two-Stage Decoder for Efficient ICD Coding
Clinical notes in healthcare facilities are tagged with International
Classification of Diseases (ICD) codes, a list of classification codes for
medical diagnoses and procedures. ICD coding is a challenging multilabel text
classification problem due to noisy clinical document inputs and a long-tailed
label distribution. Recent automated ICD coding efforts improve performance by
encoding medical notes and codes with additional data and knowledge bases.
However, most of them do not reflect how human coders generate the code: first,
the coders select general code categories and then look for specific
subcategories that are relevant to a patient's condition. Inspired by this, we
propose a two-stage decoding mechanism to predict ICD codes. Our model uses the
hierarchical properties of the codes to split the prediction into two steps:
first, we predict the parent code, and then we predict the child code based on
that prediction. Experiments on the public MIMIC-III data set show that our
model performs well in single-model settings without external data or
knowledge.
Comment: Accepted to ACL'2
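The two-step idea described in this abstract (parent code first, then child code) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `two_stage_decode`, the threshold of 0.5, the example scores, and the use of a hard mask over children. The abstract does not specify how the child prediction is conditioned on the parent prediction, so this is not the authors' model.

```python
from typing import Dict, List


def two_stage_decode(
    parent_scores: Dict[str, float],
    child_scores: Dict[str, float],
    parent_of: Dict[str, str],
    threshold: float = 0.5,
) -> List[str]:
    """Return child codes that pass the threshold under a selected parent.

    Hypothetical helper: a hard mask stands in for whatever conditioning
    the actual model uses.
    """
    # Stage 1: select the general code categories (parent codes).
    selected_parents = {p for p, s in parent_scores.items() if s >= threshold}
    # Stage 2: keep only child codes whose parent was selected in stage 1.
    return sorted(
        c
        for c, s in child_scores.items()
        if parent_of[c] in selected_parents and s >= threshold
    )


# Illustrative scores only; not taken from any experiment.
parent_scores = {"428": 0.91, "250": 0.12}
child_scores = {"428.0": 0.84, "428.9": 0.30, "250.00": 0.77}
parent_of = {"428.0": "428", "428.9": "428", "250.00": "250"}

predicted = two_stage_decode(parent_scores, child_scores, parent_of)
print(predicted)  # ['428.0']
```

In this toy input, `250.00` scores highly on its own but is dropped because its parent category was not selected, which is the effect a hierarchical split is meant to have.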
A Framework for Evaluation of Machine Reading Comprehension Gold Standards
Machine Reading Comprehension (MRC) is the task of answering a question over
a paragraph of text. While neural MRC systems gain popularity and achieve
noticeable performance, issues are being raised with the methodology used to
establish their performance, particularly concerning the data design of gold
standards that are used to evaluate them. There is but a limited understanding
of the challenges present in this data, which makes it hard to draw comparisons
and formulate reliable hypotheses. As a first step towards alleviating the
problem, this paper proposes a unifying framework to systematically investigate
the linguistic features present, the required reasoning and background
knowledge, and factual correctness on the one hand, and the presence of lexical
cues as a lower bound for the requirement of understanding on the other. We
propose a qualitative annotation schema for the former and a set of
approximative metrics for the latter. In a first application of the framework, we analyse
modern MRC gold standards and present our findings: the absence of features
that contribute towards lexical ambiguity, the varying factual correctness of
the expected answers and the presence of lexical cues, all of which potentially
lower the reading comprehension complexity and quality of the evaluation data.
Comment: In Proceedings of the 12th International Conference on Language
Resources and Evaluation (LREC 2020
A Survey of z~6 Quasars in the SDSS Deep Stripe. II. Discovery of Six Quasars at z_{AB}>21
We present the discovery of six new quasars at z~6 selected from the Sloan
Digital Sky Survey (SDSS) southern survey, a deep imaging survey obtained by
repeatedly scanning a stripe along the celestial equator. The six quasars are
about two magnitudes fainter than the luminous z~6 quasars found in the SDSS
main survey and one magnitude fainter than the quasars reported in Paper I
(Jiang et al. 2008). Four of them comprise a complete flux-limited sample at
21<z_AB<21.8 over an effective area of 195 deg^2. The other two quasars are
fainter than z_AB=22 and are not part of the complete sample. The quasar
luminosity function at z~6 is well described as a single power law
\Phi(L_{1450}) \propto L_{1450}^{\beta} over the luminosity range
-28<M_{1450}<-25. The best-fitting slope \beta varies from -2.6 to -3.1,
depending on the quasar samples used, with a statistical error of 0.3-0.4.
About 40% of the quasars discovered in the SDSS southern survey have very
narrow Ly\alpha emission lines, which may indicate small black hole masses and high
Eddington luminosity ratios, and therefore short black hole growth time scales
for these faint quasars at early epochs.
Comment: Accepted for publication in A
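As a worked illustration of the single power law quoted in this abstract, the luminosity function can be rewritten per unit absolute magnitude. The only assumption added here is the standard magnitude-luminosity relation L_{1450} \propto 10^{-0.4 M_{1450}}; the abstract does not state this step.

```latex
\begin{align}
\Phi(L_{1450}) &\propto L_{1450}^{\beta},
  \qquad L_{1450} \propto 10^{-0.4\,M_{1450}} \\
\Phi(M_{1450}) &= \Phi(L_{1450})\,
  \left|\frac{dL_{1450}}{dM_{1450}}\right|
  \propto L_{1450}^{\beta+1}
  \propto 10^{-0.4\,(\beta+1)\,M_{1450}}
\end{align}
```

For the quoted slopes, \beta = -2.6 gives \Phi(M_{1450}) \propto 10^{0.64\,M_{1450}} and \beta = -3.1 gives \Phi(M_{1450}) \propto 10^{0.84\,M_{1450}}, so under this relation the space density per magnitude rises by a factor of roughly 4.4 to 6.9 for each magnitude fainter.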
Update on the Nature of Virgo Overdensity
We use the Eighth Data Release of the Sloan Digital Sky Survey (SDSS DR8)
catalog, with its additional sky coverage of the southern Galactic hemisphere, to
measure the extent and study the nature of the Virgo Overdensity (VOD; Juric et
al. 2008). The data show that the VOD extends over no less than 2000 deg^2,
with its true extent likely closer to 3000 deg^2. We test whether the VOD can
be attributed to a tilt in the stellar halo ellipsoid with respect to the plane
of the Galactic disk and find that the observed symmetry of the north-south
Galactic hemisphere star counts excludes this possibility. We argue that the
Virgo Overdensity, in spite of its wide area and cloud-like appearance, is
still best explained by a minor merger. Its appearance and position are
qualitatively similar to a near-perigalacticon merger event and, assuming that
the VOD and the Virgo Stellar Stream share the same progenitor, consistent with
the VSS orbit determined by Casetti-Dinescu et al. (2009).
Comment: 9 pages, 6 figures; accepted for publication in A